BUG: Remove "Mean of empty slice" warning in nanmedian #58107

Merged
merged 2 commits into from
Apr 1, 2024

sebastian-correa
Contributor

@sebastian-correa sebastian-correa commented Apr 1, 2024

This PR suppresses a warning that is emitted when calculating the median of a pd.Series that is full of NAs. The warning only reports that numpy received an empty array; numpy still correctly returns np.nan.

The warning is issued as follows:

  1. When the series has NAs and you try to calculate the median, nanops.nanmedian is called.
  2. When the array is full of NAs, the _mask in nanops.nanmedian.get_median is all False, so np.nanmedian(x[_mask]) gets an empty array.
  3. np.nanmedian calls numpy.lib.nanfunctions._nanmedian which has a short-circuit path when the array is empty [ref].
  4. This path calls np.nanmean. This creates an empty mask [ref] because the array is empty.
  5. Then, the sum of the negated mask is used as the denominator of the mean, which causes the result to be np.nan (correct!).
  6. The method issues a warning because the array was empty.
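
The chain above can be reproduced with numpy alone. This is a minimal sketch, assuming a float array and using np.isnan as a stand-in for the pandas mask; the variable names are illustrative and not taken from nanops.

```python
import warnings

import numpy as np

# An all-NA input leaves nothing after masking, so np.nanmedian sees an empty array.
values = np.array([np.nan, np.nan, np.nan])
not_na = ~np.isnan(values)  # all False, like _mask in get_median
selected = values[not_na]   # empty float array

# The denominator described in step 5: the count of non-NaN entries.
count = int((~np.isnan(selected)).sum())

# Record every warning raised while computing the median of the empty array.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = np.nanmedian(selected)

messages = [str(w.message) for w in caught]
print(selected.size, count, result, messages)
```

The result is NaN and the recorded messages include "Mean of empty slice", which is the warning this PR targets.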

Another way to solve this would be to change get_median to explicitly return np.nan before the call to np.nanmedian, but that would involve tampering with the condition in the if, which has no comments so I'm unsure if changing it makes sense.

    def get_median(x, _mask=None):
        if _mask is None:
            _mask = notna(x)
        else:
            _mask = ~_mask
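        # _mask is True for non-NA values here, so "all NA" means no True entries.
        all_na = not _mask.any()
        if (not skipna and not _mask.all()) or all_na:
            return np.nan
        with warnings.catch_warnings():
            # Suppress RuntimeWarning about All-NaN slice
            warnings.filterwarnings(
                "ignore", "All-NaN slice encountered", RuntimeWarning
            )
            res = np.nanmedian(x[_mask])
        return res

Compared with the current code, this only adds the or all_na clause, which returns np.nan for the all-NA case before np.nanmedian is reached, I think.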
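
A standalone sketch of this early-return idea may make the behaviour easier to check. It assumes float input and uses np.isnan in place of pandas' notna; the function name and the has_na and all_na helpers are hypothetical and are not part of nanops.

```python
import numpy as np


def median_with_early_return(values, skipna=True):
    """Hypothetical stand-in for get_median; not the pandas implementation."""
    values = np.asarray(values, dtype=float)
    not_na = ~np.isnan(values)      # True where a value is present
    has_na = not not_na.all()       # at least one NA
    all_na = not not_na.any()       # no usable values (also true for empty input)
    if (not skipna and has_na) or all_na:
        # Return before np.nanmedian ever sees an empty array.
        return np.nan
    return np.nanmedian(values[not_na])


print(median_with_early_return([np.nan, np.nan]))
print(median_with_early_return([1.0, np.nan, 3.0]))
print(median_with_early_return([1.0, np.nan], skipna=False))
```

Because the all-NA case never reaches np.nanmedian, no warning filter is needed on that path.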

I haven't added tests, ensured that all of the below pass, or added comments, because I first want to confirm that this is the preferred solution. Otherwise, I can edit get_median. Let me know!
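
For comparison, the filtering approach taken in this PR can be sketched outside pandas as follows. This is an illustration under assumptions, not the merged diff: the function name is hypothetical, the input is assumed to be float, and np.isnan stands in for the pandas mask.

```python
import warnings

import numpy as np


def median_with_filtered_warnings(values):
    """Hypothetical stand-in for get_median that silences the known warnings."""
    values = np.asarray(values, dtype=float)
    not_na = ~np.isnan(values)
    with warnings.catch_warnings():
        # Filters added here are discarded when the with block exits.
        warnings.filterwarnings(
            "ignore", "All-NaN slice encountered", RuntimeWarning
        )
        warnings.filterwarnings("ignore", "Mean of empty slice", RuntimeWarning)
        return np.nanmedian(values[not_na])


print(median_with_filtered_warnings([np.nan, np.nan]))
print(median_with_filtered_warnings([1.0, np.nan, 5.0]))
```

The condition in the if is left untouched with this approach, which is the reason given above for preferring it.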


When the array is full of NAs, it ends up in get_median which issues
that warning.
@mroeschke mroeschke added the Warnings label (Warnings that appear or should be added to pandas) Apr 1, 2024
@mroeschke
Member

pre-commit.ci autofix

@mroeschke mroeschke added this to the 3.0 milestone Apr 1, 2024
@mroeschke mroeschke merged commit ce581e6 into pandas-dev:main Apr 1, 2024
46 checks passed
@mroeschke
Member

Thanks @sebastian-correa

pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024

* Filter Mean of empty slice warning in nanmedian

When the array is full of NAs, it ends up in get_median which issues
that warning.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@sebastian-correa sebastian-correa deleted the fix-median-warning branch July 24, 2024 13:51